Abstract. Consider a birth and death chain to model the number of types of a given virus. Each type gives birth to a new type at rate $\lambda$ and dies at rate 1. Each type is also assigned a fitness. When a death occurs either the least fit type dies (with probability $1 - r$) or we kill a type at random (with probability $r$). We show that this random killing has a large effect (for any $r > 0$) on the behavior of the model when $\lambda < 1$. The behavior of the model with $r > 0$ and $\lambda < 1$ is consistent with features of the phylogenetic tree of influenza.
1 Introduction.

Consider the following model for the evolution of a virus. The model depends on two parameters, $\lambda > 0$ and $r \in [0, 1]$. We think of $\lambda$ as the mutation rate. The number of types at time $t$ is denoted by $X(t)$, a birth-death process which makes transitions

$$n \to \begin{cases} n + 1 & \text{at rate } n\lambda \text{ for } n \ge 1, \\ n - 1 & \text{at rate } n \text{ for } n \ge 2 \end{cases}$$

(the number of types is never less than one). Each virus type has a fitness $\phi$, chosen at random from the uniform (0,1) distribution when it is created (so each new type is different from all previous types). When a type dies, the type that is chosen to die is, with probability $r$, selected uniformly among the existing types, and with probability $1 - r$ it is the type with minimal fitness. We will say that with probability $r$ a random killing occurs.
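The dynamics just described can be sketched as a short event-driven simulation. This is only an illustrative sketch, not part of the analysis: the function names `simulate` and `max_fitness_and_age`, the `seed` argument, and the representation of a type as a `(fitness, birth_time)` pair are all assumptions made for the example.

```python
import random

def simulate(lam, r, t_max, seed=0):
    """Run the type process up to time t_max.

    Returns the list of types alive at t_max, each a (fitness, birth_time) pair.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    t = 0.0
    types = [(rng.random(), 0.0)]  # one initial type, uniform fitness, born at time 0
    while True:
        n = len(types)
        birth_rate = n * lam
        death_rate = n if n >= 2 else 0.0  # the last remaining type never dies
        total = birth_rate + death_rate
        t += rng.expovariate(total)        # waiting time until the next transition
        if t > t_max:
            return types
        if rng.random() * total < birth_rate:
            types.append((rng.random(), t))  # new type with a fresh uniform fitness
        elif rng.random() < r:
            types.pop(rng.randrange(n))      # random killing: uniform over current types
        else:
            types.remove(min(types))         # kill the least fit type

def max_fitness_and_age(types, t):
    """Maximal fitness among the given types and the age of that type at time t."""
    fitness, born = max(types)
    return fitness, t - born

if __name__ == "__main__":
    alive = simulate(lam=0.5, r=0.3, t_max=50.0, seed=1)
    print(len(alive), max_fitness_and_age(alive, 50.0))
```

For $\lambda > 1$ the number of types grows exponentially in $t$, so `t_max` should be kept small in that regime.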

The model with $r = 0$ (the least fit type is always killed) was introduced by Liggett and Schinazi in [7]. Several articles have since been written on closely related models, see [3], [5], [6] and [8]. "Kill the least fit" models go back to at least [2]. The model with random killing (i.e. $r > 0$) is a natural extension for at least two reasons. From a modeling perspective "Kill the least fit" is quite natural. However, assuming that this is always the case is not. Random events should occasionally prevent this transition from happening. Furthermore, from a mathematical perspective it seems interesting to study the effect of small random perturbations of the basic model. As we will see, they can have major effects on the behavior of the model.



We are interested in

$$\phi_t = \phi_t^r = \text{the maximal fitness of the types alive at time } t,$$

$$a_t = a_t^r = \text{the age of the type with maximal fitness at time } t$$

(if a type is created at time $s$ then its age at time $t \ge s$ is $t - s$). We start the process with a single individual. We assume that its fitness $\phi_0$ is uniformly distributed on (0,1), and initially we take $a_0 = 0$.

Let $\Rightarrow$ denote weak convergence and $\to_p$ denote convergence in probability. The following theorem summarizes the main results of [7].

Theorem 1 ([7]). Assume $r = 0$, and $Y$ is uniformly distributed on the interval (0,1).
(a) If $\lambda < 1$ then $a_t/t \Rightarrow Y$ as $t \to \infty$.
(b) If $\lambda > 1$ then $a_t/t \to_p 0$ as $t \to \infty$.

When $\lambda < 1$, $X(t)$ converges in distribution to its stationary distribution, and hence at any given time there will not be many types. In this case, (a) above shows that the fittest type at time $t$ will have been around for order of time $t$. As noted in [7], this is consistent with the observed structure of an influenza tree. When $\lambda > 1$, $X(t)$ tends to infinity as $t \to \infty$, and (b) shows that the fittest type at time $t$ has been around for only $o(t)$ time. As noted in [7], this is consistent with the observed structure of an HIV tree. In the critical case $\lambda = 1$ we have something in between these two pictures. It is easy to see that in all cases the maximal fitness $\phi_t \to 1$ as $t \to \infty$.

Theorem 1 shows that the model with $r = 0$ can, by adjusting $\lambda$, describe rather different evolutions. Nevertheless, it has some limitations. The maximal fitness always tends to 1, and for $\lambda < 1$ the age $a_t$ tends to infinity. As shown below, the model with random killing ($r > 0$) allows for the possibilities that $\phi_t \not\to 1$ and $a_t \not\to \infty$.

Before proceeding to our results for the $r > 0$ case we resolve one question left open by Theorem 1. Namely, (b) leaves open the two possibilities: $a_t$ is (stochastically) bounded as $t \to \infty$, or $a_t \to \infty$. It turns out that $a_t$ does not tend to infinity; instead it converges in distribution. For the sake of completeness, we include the behavior of the maximal fitness in the following result.

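Theorem 2. Assume $r = 0$, and let $\mathcal{E}$ be a mean one exponential random variable.

(a) For $\lambda > 0$, $\phi_t \to 1$ a.s. as $t \to \infty$.

(b) For $\lambda > 1$, $a_t \Rightarrow \mathcal{E}/(\lambda - 1)$ as $t \to \infty$.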

We turn to the case of random killings ($r > 0$) and focus on the $\lambda < 1$ case. We see that the behaviors of the maximal fitness and age processes are quite different from the $r = 0$ case.

Theorem 3. Assume $r > 0$ and $\lambda < 1$. Then

(a) $\phi_t$ converges in distribution as $t \to \infty$ to a nondegenerate limit law, and

(b) $a_t$ converges in distribution as $t \to \infty$ to a nondegenerate limit law.

Theorem 3 is consistent with features of the influenza phylogenetic tree. The most fit type lasts a finite random time and then is replaced by a new most fit type and so on. As desired, $a_t$ does not go to infinity with $t$ and $\phi_t$ does not go to 1. Instead they converge to nondegenerate limits.

Turning to the $r > 0$, $\lambda > 1$ case, our results are less complete. We can show that the fitness $\phi_t$ tends to one as $t \to \infty$, but we cannot show, as we conjecture, that the age $a_t$ does not tend to infinity.

Theorem 4. For $r > 0$ and $\lambda > 1$, $\phi_t \to_p 1$ as $t \to \infty$.

In the next section we give the proof of Theorem 2. In Section 3 we give a construction that we use to 
prove Theorem 3. The construction allows us to write down a renewal type description of the limit laws for 
both the fitness and age processes. In Section 4 we use a different construction to prove Theorem 4. 

2 Proof of Theorem 2 

Let us dispense with the easy convergence $\phi_t \to 1$. Let $B(t)$ be the number of types created by time $t$, and let $U_1, U_2, \ldots$ be the successive iid uniform (0, 1) random variables created as the process evolves. Then $\phi_t = \max\{U_1, \ldots, U_{B(t)}\}$. It is easy to see that $\max\{U_1, \ldots, U_n\} \to 1$ a.s. as $n \to \infty$. Since $B(t) \to \infty$ a.s. we get $\phi_t \to 1$ a.s.
For (b), fix $\lambda > 1$ and recall the notation of Section 3 of [7]. Following the notation there, let $\tau_n$ be the first time $X_t$ reaches $n$, let $N(t) = \sup\{n : \tau_n \le t\}$, and set

$$\zeta(n) = \tau_n - \frac{\log n}{\lambda - 1}.$$

We need an improvement of Lemma 1 of [7].

Lemma 1. With probability one, $\lim_{t \to \infty} N(t) e^{-(\lambda-1)t} = e^{-(\lambda-1)\zeta_\infty}$, a strictly positive finite limit.

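Proof. It was shown at the end of the proof of Lemma 1 in [7] that $\zeta(n) \to \zeta_\infty$ a.s. as $n \to \infty$ for some finite random variable $\zeta_\infty$. Since $N(t) \to \infty$ as $t \to \infty$ we also have $\zeta(N(t)) \to \zeta_\infty$ a.s. as $t \to \infty$. By definition,

$$\tau_{N(t)} \le t < \tau_{N(t)+1} \tag{2.1}$$

or

$$\log(N(t)) - (\lambda - 1)t \le -(\lambda - 1)\zeta(N(t)).$$

This implies $N(t)e^{-(\lambda-1)t} \le e^{-(\lambda-1)\zeta(N(t))}$ and therefore

$$\limsup_{t \to \infty} N(t)e^{-(\lambda-1)t} \le e^{-(\lambda-1)\zeta_\infty} \quad \text{a.s.}$$

To get an inequality in the reverse direction we note that (2.1) implies $t < \tau_{N(t)+1}$, or

$$\log(N(t) + 1) - (\lambda - 1)t > -(\lambda - 1)\zeta(N(t) + 1).$$

This implies $(N(t) + 1)e^{-(\lambda-1)t} > e^{-(\lambda-1)\zeta(N(t)+1)}$ and therefore

$$\liminf_{t \to \infty} N(t)e^{-(\lambda-1)t} \ge e^{-(\lambda-1)\zeta_\infty} \quad \text{a.s.}$$

This completes the proof, since $e^{-(\lambda-1)\zeta_\infty}$ is positive and finite with probability one. □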
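The normalization in Lemma 1 can be explored numerically. The sketch below simulates only the birth-death chain $X(t)$ for $\lambda > 1$, tracks $N(t)$ as the running maximum of $X$ (which equals $\sup\{n : \tau_n \le t\}$), and reports $N(t)e^{-(\lambda-1)t}$ at a few times. The function name `scaled_record` and its arguments are hypothetical, chosen for this illustration; a single path only suggests the stabilization, it does not prove it.

```python
import math
import random

def scaled_record(lam, times, seed=0):
    """Return N(t) * exp(-(lam - 1) * t) for each t in the increasing list `times`.

    N(t) is the running maximum of the birth-death chain started from X(0) = 1.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    t, x, record = 0.0, 1, 1
    pending = list(times)
    out = []
    while pending:
        rate = x * lam + (x if x >= 2 else 0)  # no deaths while a single type remains
        t_next = t + rng.expovariate(rate)
        # Report every requested time that falls before the next transition.
        while pending and pending[0] < t_next:
            s = pending.pop(0)
            out.append(record * math.exp(-(lam - 1.0) * s))
        t = t_next
        if rng.random() * rate < x * lam:
            x += 1
            record = max(record, x)
        else:
            x -= 1
    return out

if __name__ == "__main__":
    print(scaled_record(lam=2.0, times=[2.0, 4.0, 6.0, 8.0], seed=3))
```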

When $r = 0$ the maximal fitness $\phi_t$ is increasing in $t$. This implies that for $s < t$, $a_t \ge t - s$ if and only if $\phi_s = \phi_t$. Let $S_n$ be the number of types produced up to time $\tau_n$. By (1) and (2) in [7],

$$E\left[\frac{S_{N(s)}}{S_{N(t)+1}},\ N(s) < N(t)\right] \le P\big(\phi_s = \phi_t,\ N(s) < N(t)\big) \le E\left[\frac{S_{N(s)+1}}{S_{N(t)}},\ N(s) < N(t)\right]. \tag{2.2}$$

Fix $u > 0$ and let $s = t - u$. By Lemma 1, $P(N(s) < N(t)) \to 1$ as $t \to \infty$, so it suffices to prove that both the left side and right side of (2.2) converge to $e^{-(\lambda-1)u}$.

It was shown in [7] that $S_n/n$ converges a.s. to a finite positive limit as $n \to \infty$. By this fact, $N(t) \to \infty$, and Lemma 1,

$$\frac{S_{N(s)+1}}{S_{N(t)}} = \frac{S_{N(s)+1}}{N(s)+1} \cdot \frac{N(t)}{S_{N(t)}} \cdot \frac{N(s)+1}{N(t)} \to e^{-(\lambda-1)u}.$$

It follows that the right side of (2.2) converges to $e^{-(\lambda-1)u}$ as $t \to \infty$. A similar argument handles the left side of (2.2). This completes the proof of Theorem 2.
3 Proof of Theorem 3.

Throughout this section $0 < r \le 1$ and $0 < \lambda < 1$ are fixed. We first extend the notation of Section 2 of [7], making the following definitions and observations.

(1) Put $T_0 = 0$ and for $n \ge 1$ let $T_n$ be the time of the $n$th return of $X(t)$ to state 1. The "interarrival times" $\{T_n - T_{n-1}, n \ge 1\}$ are iid random variables.

(2) For $n \ge 1$ let $\xi_n$ be the duration of the $n$th sojourn time in state 1,

$$\xi_n = \inf\{t > T_{n-1} : X_t \ne 1\} - T_{n-1}.$$

The random variables $\{\xi_n, n \ge 1\}$ are iid exponential with parameter $\lambda$. Note also that for $n \ge 0$, $\sigma(T_0, \ldots, T_n)$ is independent of $\sigma(\xi_{n+1}, \xi_{n+2}, \ldots)$.

(3) For $n \ge 1$ let $u_n$ be the uniform random variable created at time $T_{n-1} + \xi_n$ when $X(t)$ jumps from 1 to 2. At time $T_{n-1} + \xi_n$ there are two types, with fitnesses $\phi(T_{n-1}), u_n$. The $\{u_n, n \ge 1\}$ are iid uniform (0,1) rv's, independent of the sequences $\{T_n, n \ge 0\}$ and $\{\xi_n, n \ge 1\}$.

(4) For $n \ge 1$ let $\eta_n$ be the duration of the sojourn time in 2 starting at time $T_{n-1} + \xi_n$,

$$\eta_n = \inf\{t > T_{n-1} + \xi_n : X_t \ne 2\} - (T_{n-1} + \xi_n).$$

The random variables $\{\eta_n, n \ge 1\}$ are iid exponential with parameter $2\lambda + 2$, independent of $\{\xi_n, n \ge 1\}$ and $\{u_n, n \ge 1\}$. Furthermore, $\sigma(T_0, \ldots, T_n)$ is independent of $\sigma(\eta_{n+1}, \eta_{n+2}, \ldots)$.

(5) For $n \ge 1$ let $T'_n = T_{n-1} + \xi_n + \eta_n$. For all $t \in [T_{n-1} + \xi_n, T'_n)$ there are exactly two types, and the fitnesses are $\phi(T_{n-1}), u_n$.

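(6) At time $T'_n$, if $X(t)$ jumps to 1, with probability $r$ one of the types $u_n$, $\phi(T_{n-1})$ is chosen at random to be killed. For $n \ge 1$ let

$$\varepsilon_n = \begin{cases} 1 & \text{if at time } T'_n,\ X_t \text{ jumps to 1 and the type } \phi(T_{n-1}) \text{ is killed by random killing,} \\ 0 & \text{otherwise.} \end{cases}$$

Note that we do not include in the event $\{\varepsilon_n = 1\}$ the possibility that $\phi(T_{n-1}) < u_n$ and the least fit type is killed with probability $1 - r$. The random variables $\{\varepsilon_n, n \ge 1\}$ are iid Bernoulli with mean

$$\frac{2}{2(1 + \lambda)} \cdot \frac{r}{2} = \frac{r}{2(1 + \lambda)} > 0.$$

Also, the sequence $\{\varepsilon_n, n \ge 1\}$ is independent of the sequence $\{u_n, n \ge 1\}$, and $\sigma(T_k, \xi_k, \eta_k, k \le n)$ is independent of $\sigma(\varepsilon_{n+1}, \varepsilon_{n+2}, \ldots)$.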

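(7) To consider the return times $T_j$ corresponding to the event $\{\varepsilon_n = 1\}$, put $\kappa_0 = 0$, $R_0 = 0$, and for $n \ge 1$ define

$$\kappa_n = \inf\{k > \kappa_{n-1} : \varepsilon_k = 1\} \quad \text{and} \quad R_n = T_{\kappa_n}.$$

The random variables $\{R_n - R_{n-1}, n \ge 1\}$ are iid, with $\mu = E R_1 \in (0, \infty]$, and at the times $R_n$, $n \ge 1$,

$$\phi_{R_n} = u_{\kappa_n} \text{ is uniform on } (0, 1), \qquad a_{R_n} = \eta_{\kappa_n} \text{ is exponential with parameter } 2(\lambda + 1). \tag{3.1}$$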

The construction is illustrated in Figure 1 below, in which $\varepsilon_1 = 0$, $\varepsilon_2 = 1$ and $R_1 = T_2$.

[Figure 1: a time axis marking $T_0$, $T_1$, $T_2$, with $R_0 = T_0$ and $R_1 = T_2$.]

By (3.1), at time $R_1$ there is a single type, its fitness has the uniform distribution on (0, 1), and its age has the exponential distribution with parameter $2(\lambda + 1)$. Furthermore, given this information, the distribution of our process for $t \ge R_1$ is independent of what has happened before time $R_1$. It follows that if we start at time 0 with a single type with fitness uniformly distributed on (0,1) and age exponentially distributed with parameter $2(\lambda + 1)$ then $R_1$ is a regeneration time. The strong Markov property now implies the following result.

Lemma 2. If $\phi_0$ is uniformly distributed on (0, 1) and $a_0$ is exponentially distributed with parameter $2(\lambda + 1)$ then for $t \ge 0$,

$$P(\phi_t \le v, R_1 \le t) = \int_0^t P(R_1 \in ds)\, P(\phi_{t-s} \le v), \quad 0 < v < 1, \tag{3.2}$$

and

$$P(a_t \le x, R_1 \le t) = \int_0^t P(R_1 \in ds)\, P(a_{t-s} \le x), \quad x > 0. \tag{3.3}$$

Remark 1. The fitness process does not depend on the age process, so (3.2) holds regardless of the distribution of $a_0$.

In order to make use of (3.2) and (3.3) we will need information on the tail of the distribution of $R_1$, which is provided by our next result.

Lemma 3. For $\lambda < 1$ there are constants $C, \gamma$ such that $P(R_1 > t) \le Ce^{-\gamma t}$. In particular, $E(R_1) < \infty$.

Proof. We are going to use Gronwall's inequality. Let $\tilde X(t)$ denote $X(t)$ starting at 3 instead of 1, let $\tilde T_1$ be the first time $\tilde X(t)$ reaches 1, and let $\tilde R_1$ be defined analogously to $R_1$. By a simple coupling it is clear that $P(R_1 > t) \le P(\tilde R_1 > t)$ for all $t \ge 0$. Let $\tau$ be the first time $\tilde X(t)$ reaches 2 after reaching 1,

$$\tau = \inf\{t > \tilde T_1 : \tilde X(t) = 2\},$$

and let $\tilde\eta$ be an independent exponential random variable with parameter $2\lambda + 2$. Finally, let $\tau' = \tau + \tilde\eta$. By the Markov property, with $p = E\varepsilon_1$,

$$P(\tilde R_1 > t) = P(\tau' > t) + (1 - p) \int_0^t P(\tau' \in ds)\, P(\tilde R_1 > t - s).$$

It follows now from Gronwall's inequality that

$$P(\tilde R_1 > t) \le P(\tau' > t)\, e^{(1-p)P(\tau' \le t)} \le e\, P(\tau' > t).$$
Since $\tau' = \tau + \tilde\eta$, it suffices now to prove that $\tau$ has an exponential tail.

For the remainder of this argument we amend the dynamics of $\tilde X(t)$ to include a transition from 1 to 0 at rate 1, and treat 0 as a trap. If we let $\tau_0$ be the first hitting time of 0, then $\tau_0 > \tau$, so the final reduction is to prove that for some constants $C, \gamma$,

$$P(\tau_0 > t \mid \tilde X(0) = 3) = P(\tilde X(t) \ne 0 \mid \tilde X(0) = 3) \le Ce^{-\gamma t}.$$

The amended birth-death process $\tilde X(t)$ is a continuous time branching process, as shown in Section III.5 of [1], where an explicit expression for the generating function $\sum_k s^k P(\tilde X(t) = k \mid \tilde X(0) = 1)$ is given. Setting $s = 0$ we obtain

$$P(\tilde X(t) \ne 0 \mid \tilde X(0) = 1) = \frac{(1 - \lambda)e^{-(1-\lambda)t}}{1 - \lambda e^{-(1-\lambda)t}} \le e^{-(1-\lambda)t}.$$

By the branching property, we get $P(\tilde X(t) \ne 0 \mid \tilde X(0) = 3) \le 3P(\tilde X(t) \ne 0 \mid \tilde X(0) = 1)$ so we are done. □
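The last step of the proof can be checked numerically. The sketch below assumes the classical closed form for the survival probability of a linear birth-death process with per-individual birth rate $\lambda < 1$ and death rate 1, namely $(1-\lambda)e^{-(1-\lambda)t}/(1-\lambda e^{-(1-\lambda)t})$; this formula and the function names are assumptions of the example, not quotations from [1]. It then compares the value with the exponential bound $e^{-(1-\lambda)t}$ and with the union bound for three initial individuals.

```python
import math

def survival_probability(lam, t):
    """P(X(t) != 0 | X(0) = 1) for the linear birth-death process with
    birth rate lam < 1 per individual and death rate 1 (assumed closed form)."""
    decay = math.exp(-(1.0 - lam) * t)
    return (1.0 - lam) * decay / (1.0 - lam * decay)

def tail_bound_from_three(lam, t):
    """Union bound over three independent lineages, as in the branching property step."""
    return 3.0 * survival_probability(lam, t)

if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0, 10.0):
        q = survival_probability(0.5, t)
        print(t, q, math.exp(-0.5 * t), tail_bound_from_three(0.5, t))
```

Under this formula the constants in the final reduction can be taken as $C = 3$ and $\gamma = 1 - \lambda$.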

With these facts established we begin the proof of part (a) of Theorem 3. Let $F(t) = P(R_1 \le t)$, and let $U = \sum_{n \ge 0} F^{*n}$ be the corresponding renewal function, $U(t) = \sum_n P(R_n \le t)$. Fix $v \in (0, 1)$ and define

$$h_v(t) = P(\phi_t \le v, R_1 > t) \quad \text{and} \quad H_v(t) = P(\phi_t \le v).$$

By decomposing the event defining $H_v(t)$ according to the value of $R_1$ and using (3.2), we have

$$H_v(t) = h_v(t) + P(\phi_t \le v, R_1 \le t) = h_v(t) + \int_0^t H_v(t - s)\, F(ds). \tag{3.4}$$

It follows from Theorem 4.4.4 of [4] that the solution to this renewal equation is given by

$$H_v(t) = \int_0^t h_v(t - s)\, U(ds). \tag{3.5}$$

We claim that

$$h_v(t) \text{ is directly Riemann integrable if } \lambda < 1. \tag{3.6}$$

Given this, a standard renewal theorem (Theorem 4.4.5 of [4]) implies that

$$H_v(t) \to \frac{1}{\mu} \int_0^\infty h_v(s)\, ds \quad \text{as } t \to \infty \tag{3.7}$$

or

$$\lim_{t \to \infty} P(\phi_t \le v) = \frac{1}{\mu} \int_0^\infty P(\phi_s \le v, R_1 > s)\, ds \tag{3.8}$$

(recall that $\mu = E(R_1)$). For $\lambda = 1$ we still have (3.5), but not (3.7) since this depends on $\mu < \infty$.

In view of the fact that $P(R_1 > t)$ decays exponentially fast, to prove (3.6) it suffices to prove that $h_v(t)$ is a continuous function of $t$. For $s < t$ let $\Gamma_{s,t}$ be the event that the birth-death process makes no transitions in the time interval $[s, t]$. On $\Gamma_{s,t}$, $\phi_\cdot$ cannot change, and $R_1 > s$ if and only if $R_1 > t$, so that

$$P(\{\phi_s \le v, R_1 > s\} \cap \Gamma_{s,t}) = P(\{\phi_t \le v, R_1 > t\} \cap \Gamma_{s,t}).$$

It follows that

$$|h_v(s) - h_v(t)| \le P(\Gamma_{s,t}^c) = \sum_{k=1}^\infty P(X(s) = k)\big(1 - e^{-k(\lambda+1)(t-s)}\big) \le \sum_{k=1}^\infty P(X(s) = k)\, k(\lambda + 1)(t - s) = (t - s)(\lambda + 1)E(X(s)).$$

For $\lambda < 1$, $\sup_s E(X(s)) < \infty$, so we have proved that $h_v$ is continuous and directly Riemann integrable.

For Theorem 3(b), we suppose first that $a_0$ is exponential with parameter $2(\lambda + 1)$, so that (3.3) holds. Now we follow the previous argument. Fix $x > 0$ and define

$$g_x(t) = P(a_t \le x, R_1 > t) \quad \text{and} \quad G_x(t) = P(a_t \le x).$$

As in the argument for Theorem 3(a), for $\lambda < 1$ we have

$$G_x(t) = g_x(t) + \int_0^t G_x(t - s)\, F(ds) = \int_0^t g_x(t - s)\, U(ds). \tag{3.9}$$

For $\lambda < 1$, an argument similar to the one for $h_v(t)$ shows that $g_x(t)$ is directly Riemann integrable, and by the renewal theorem

$$G_x(t) \to \frac{1}{\mu} \int_0^\infty g_x(s)\, ds \quad \text{as } t \to \infty \tag{3.10}$$

or

$$\lim_{t \to \infty} P(a_t \le x) = \frac{1}{\mu} \int_0^\infty P(a_s \le x, R_1 > s)\, ds. \tag{3.11}$$

Given any $a_0 \ge 0$, by using the same birth-death process and sequence of uniform random variables, we may construct an age process $\hat a_t$ with the property that

$$\hat a_t = a_t \text{ if } t \ge R_1. \tag{3.12}$$

This is because at time $R_1 = T_k$ for some $k$, the most fit type is the uniform random variable created at time $T_{k-1} + \xi_k$, and has age $\eta_k = a_{R_1} = \hat a_{R_1}$. After time $R_1$ the two age processes are identical. By (3.12), $P(a_t \ne \hat a_t) \to 0$ as $t \to \infty$, and therefore for any $a_0$,

$$\lim_{t \to \infty} P(\hat a_t \le x) = \frac{1}{\mu} \int_0^\infty g_x(s)\, ds. \tag{3.13}$$

Finally, it is not hard to see that the right-hand side of (3.8) is strictly increasing in v, and the right-hand 
side of (3.11) is strictly increasing in x, so the limit distributions are nondegenerate. 



4 Proof of Theorem 4. 

We start with the case $r = 1$. In this case, conditional on $X_t = k$, the set of fitnesses has the same law as that of $k$ uniform (0, 1) random variables, and hence

$$P(\phi_t^1 \le u \mid X_t = k) = u^k, \quad 0 < u < 1. \tag{4.1}$$

This is because (i) the sequence of uniforms created when $X_t$ jumps is independent of $X_t$, (ii) when $r = 1$, the type that is killed is independent of the types that are present, and (iii) $k$ uniforms chosen randomly from $n \ge k$ iid uniforms has the law of $k$ iid uniforms. For $\lambda > 1$, $P(X_t \le K) \to 0$ as $t \to \infty$ for any $K < \infty$. Applying (4.1) we obtain

$$\phi_t^1 \to_p 1 \quad \text{as } t \to \infty. \tag{4.2}$$

To handle $\phi_t^r$ for $0 < r < 1$ we argue that $\phi_t^r$ is stochastically larger than $\phi_t^1$. To do this we will use a coupling that is based on the following definition and elementary lemma. For positive integers $k$ and sets $A, B \subset (0, 1)$ such that $|A| = |B| = k$, write $A \preceq B$ if $A$ has elements $a_1 < \cdots < a_k$ and $B$ has elements $b_1 < \cdots < b_k$ and

$$a_i \le b_i \quad \text{for } 1 \le i \le k. \tag{4.3}$$

Lemma 4. Let $A, B \subset (0, 1)$ each have $k$ elements, and suppose $A \preceq B$. Then $A' \preceq B'$ in each of the two cases:

(a) $A' = A \cup \{w\}$ and $B' = B \cup \{w\}$, where $w \in (0, 1)$ and $w \notin A \cup B$.

(b) $k \ge 2$, $A'$ is obtained by deleting any element of $A$ and $B'$ is obtained by deleting the smallest element of $B$.

In particular, $\max(B') \ge \max(A')$.

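Proof. For (a), put $a_0 = b_0 = 0$ and $a_{k+1} = b_{k+1} = 1$. Then for some $0 \le i \le k$ and $0 \le j \le k$, $w \in (a_i, a_{i+1}) \cap (b_j, b_{j+1})$, where necessarily $j \le i$. Then

$$a'_\ell = \begin{cases} a_\ell & \ell \le i, \\ w & \ell = i + 1, \\ a_{\ell-1} & \ell \ge i + 2, \end{cases} \qquad b'_\ell = \begin{cases} b_\ell & \ell \le j, \\ w & \ell = j + 1, \\ b_{\ell-1} & \ell \ge j + 2. \end{cases}$$

It is easy to check that $a'_\ell \le b'_\ell$ for all $\ell$.

For (b), if $a_i$ is the element deleted from $A$, then $a'_\ell = a_\ell$ if $\ell < i$ and $a'_\ell = a_{\ell+1}$ if $\ell \ge i$, while $b'_\ell = b_{\ell+1}$ for $\ell \ge 1$. Again, it is easy to check that $a'_\ell \le b'_\ell$ for each $\ell$. □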
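The two cases of Lemma 4 are easy to test on random inputs. The following sketch is only a sanity check, not a proof; all function names are hypothetical, and finite sets are represented as Python lists. Ordered pairs $A \preceq B$ are generated by taking coordinatewise minima and maxima of two sorted samples, which is one convenient way (an assumption of this example) to produce valid inputs.

```python
import random

def precedes(a, b):
    """True if the i-th smallest element of a is <= the i-th smallest of b for all i."""
    return len(a) == len(b) and all(x <= y for x, y in zip(sorted(a), sorted(b)))

def add_common(a, b, w):
    """Case (a): add the same new point w to both sets."""
    return a + [w], b + [w]

def delete_any_vs_smallest(a, b, index):
    """Case (b): delete the element of rank `index` from a, and the smallest from b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    return a_sorted[:index] + a_sorted[index + 1:], b_sorted[1:]

def random_ordered_pair(k, rng):
    """Build lists a, b of size k with a preceding b in the sense of (4.3)."""
    xs = sorted(rng.random() for _ in range(k))
    ys = sorted(rng.random() for _ in range(k))
    a = [min(x, y) for x, y in zip(xs, ys)]
    b = [max(x, y) for x, y in zip(xs, ys)]
    return a, b

if __name__ == "__main__":
    rng = random.Random(0)
    a, b = random_ordered_pair(5, rng)
    a2, b2 = add_common(a, b, rng.random())
    a3, b3 = delete_any_vs_smallest(a, b, rng.randrange(5))
    print(precedes(a, b), precedes(a2, b2), precedes(a3, b3))
```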

Fix $0 < r < 1$. To be very clear about the coupling we need, we note that our system can be constructed from (i) the birth-death process $X_t, t \ge 0$, (ii) an iid sequence of uniform (0, 1) random variables $V_n, n \ge 0$, (iii) a sequence of iid mean $r$ Bernoulli random variables $\varepsilon_n, n \ge 0$, and (iv) independent random variables $W_n^k$, $n, k \ge 1$, with $P(W_n^k = j) = 1/k$ for $1 \le j \le k$. When $X_t$ makes its $n$th transition up, the uniform variable $V_n$ is added to the current set of types. If $X_t$ makes its $n$th transition down, and there are $k$ types before the transition, the least fit type is deleted if $\varepsilon_n = 0$, while if $\varepsilon_n = 1$ and $W_n^k = j$ then the $j$th largest type is deleted. This gives a construction of a set of types at time $t$, $F^r(t) = \{f_1^r(t), \ldots, f_{X(t)}^r(t)\}$, with $\phi_t^r = \max(F^r(t))$.

Using the same collection of variables we may construct a second set of types $F^1(t) = \{f_1^1(t), \ldots, f_{X(t)}^1(t)\}$ as follows. Put $F^1(0) = F^r(0)$, so certainly $F^1(0) \preceq F^r(0)$. Now suppose $F^1(t) \preceq F^r(t)$ and the elements of each set are put in increasing order. If a jump up occurs for the birth process, and $w$ is the value of the uniform random variable added to $F^r(t)$, it is also added to $F^1(t)$, preserving the $\preceq$ relationship by Lemma 4. If a jump down occurs, and the appropriate $\varepsilon_n = 1$ and $W_n^k = j$, the $j$th largest element of each set is deleted. If $\varepsilon_n = 0$, the $j$th largest element of $F^1(t)$ is still deleted, while the smallest element of $F^r(t)$ is deleted. Again by Lemma 4, the $\preceq$ relationship is preserved. Furthermore, this gives a construction of the fitness process when $r = 1$, i.e., the law of $\max\{F^1(t)\}, t \ge 0$ is the same as that of $\phi_t^1, t \ge 0$.

This gives a construction with $\phi^r(t) \ge \phi^1(t)$, $t \ge 0$. In view of (4.2) this proves $\phi_t^r \to_p 1$ as $t \to \infty$.
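The coupling can be sketched in code by driving both sets of types with the same transitions, the same new fitness values, the same Bernoulli variables and the same uniform ranks. In this illustrative sketch (function name, arguments and list representation are assumptions of the example), each set is kept as a sorted list; ranks are counted from the bottom rather than the top, which does not change the law because the rank is uniform.

```python
import bisect
import random

def coupled_max_fitness(lam, r, t_max, seed=0):
    """Return a list of (time, max of F^1, max of F^r) recorded after each transition.

    Hypothetical helper: F^1 follows the r = 1 dynamics, F^r the general dynamics,
    both built from the same random inputs.
    """
    rng = random.Random(seed)
    first = rng.random()
    f_one, f_r = [first], [first]  # sorted lists holding F^1(t) and F^r(t)
    t = 0.0
    history = [(0.0, first, first)]
    while True:
        k = len(f_r)
        rate = k * lam + (k if k >= 2 else 0)  # no deaths while a single type remains
        t += rng.expovariate(rate)
        if t > t_max:
            return history
        if rng.random() * rate < k * lam:
            w = rng.random()              # the same new fitness is added to both sets
            bisect.insort(f_one, w)
            bisect.insort(f_r, w)
        else:
            random_kill = rng.random() < r
            j = rng.randrange(k)          # uniform rank, shared by both sets
            del f_one[j]                  # r = 1 dynamics: always a uniform rank
            del f_r[j if random_kill else 0]  # otherwise delete the least fit type
        history.append((t, f_one[-1], f_r[-1]))

if __name__ == "__main__":
    path = coupled_max_fitness(lam=1.5, r=0.4, t_max=5.0, seed=7)
    print(all(m_r >= m_one for _, m_one, m_r in path))
```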